Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation
Unsupervised Domain Adaptation (UDA) aims to adapt the model trained on the
labeled source domain to an unlabeled target domain. In this paper, we present
Prototypical Contrast Adaptation (ProCA), a simple and efficient contrastive
learning method for unsupervised domain adaptive semantic segmentation.
Previous domain adaptation methods merely consider the alignment of the
intra-class representational distributions across domains, while the
inter-class structural relationship is insufficiently explored; as a result,
the aligned representations on the target domain may no longer be as easily
discriminated as they are on the source domain. Instead, ProCA incorporates
inter-class information into class-wise prototypes, and adopts the
class-centered distribution alignment for adaptation. By considering the same
class prototypes as positives and other class prototypes as negatives to
achieve class-centered distribution alignment, ProCA achieves state-of-the-art
performance on classical domain adaptation tasks, i.e., GTA5 → Cityscapes
and SYNTHIA → Cityscapes. Code is available at
https://github.com/jiangzhengkai/ProCA.
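The class-centered contrastive objective described above can be illustrated with a minimal sketch (not the authors' implementation; the function name, shapes, and temperature value are assumptions for illustration): a feature is pulled toward its own class prototype and pushed away from the prototypes of all other classes.

```python
import numpy as np

def prototype_contrast_loss(feat, prototypes, label, tau=0.1):
    """Treat the feature's own class prototype as the positive and all
    other class prototypes as negatives, as in a prototypical
    contrastive objective. feat: (D,), prototypes: (C, D)."""
    # cosine similarity between the feature and every class prototype
    f = feat / np.linalg.norm(feat)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = p @ f / tau                      # (C,)
    # softmax cross-entropy with the labeled class as the positive
    logits -= logits.max()
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[label])

# toy check: a feature aligned with its own class prototype incurs a
# much lower loss than one assigned to the wrong class
protos = np.eye(3)                            # 3 classes, 3-D prototypes
loss_pos = prototype_contrast_loss(np.array([1.0, 0.0, 0.0]), protos, label=0)
loss_neg = prototype_contrast_loss(np.array([1.0, 0.0, 0.0]), protos, label=1)
```

In the full method this loss would be applied per pixel with prototypes estimated from source-domain features; the sketch only shows the positive/negative structure of the objective.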
You Only Need 90K Parameters to Adapt Light: A Light Weight Transformer for Image Enhancement and Exposure Correction
Challenging real-world illumination conditions (low light, under-exposure,
and over-exposure) not only degrade visual appearance but also harm
downstream computer vision tasks. After a camera captures raw-RGB data, an
image signal processor (ISP) renders it into a standard sRGB image. By
decomposing the ISP pipeline into local and global image components, we
propose a lightweight, fast Illumination Adaptive Transformer (IAT) to
restore a normally-lit sRGB image from low-light or under-/over-exposed
inputs.
Specifically, IAT uses attention queries to represent and adjust ISP-related
parameters such as colour correction and gamma correction. With only ~90k
parameters and ~0.004 s of processing time per image, IAT consistently
outperforms state-of-the-art methods on current low-light enhancement and
exposure correction benchmarks. Competitive experimental performance also
demonstrates that our IAT significantly enhances object detection and semantic
segmentation tasks under various light conditions. Training code and
pretrained models are available at
https://github.com/cuiziteng/Illumination-Adaptive-Transformer.
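The global ISP operations the abstract names can be sketched as follows (a minimal illustration, not IAT itself: in the paper these parameters would be predicted by learned attention queries, while here they are plain function arguments with assumed names):

```python
import numpy as np

def global_isp_adjust(img, ccm, gamma):
    """Apply two global ISP operations: a 3x3 colour-correction matrix
    followed by gamma correction. img: (H, W, 3) in [0, 1]."""
    out = img @ ccm.T                 # colour correction, per pixel
    out = np.clip(out, 0.0, 1.0)     # stay in the displayable range
    return out ** gamma              # gamma curve brightens/darkens

# toy usage: brighten a dark image with gamma < 1 and an identity
# colour-correction matrix
dark = np.full((2, 2, 3), 0.25)
bright = global_isp_adjust(dark, np.eye(3), gamma=0.5)
```

With gamma = 0.5, a pixel value of 0.25 maps to 0.5, so the dark image is brightened while values already near 1 change little.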
Dynamic Fusion with Intra- and Inter-Modality Attention Flow for Visual Question Answering
Learning effective fusion of multi-modality features is at the heart of
visual question answering. We propose a novel method that dynamically fuses
multi-modal features with intra- and inter-modality information flow,
alternately passing dynamic information within and across the visual and
language modalities. This robustly captures high-level interactions between
the language and vision domains and thus significantly improves the
performance of visual question answering. We also show that the proposed
dynamic intra-modality attention flow conditioned on the other modality can
dynamically modulate the intra-modality attention of the target modality, which
is vital for multimodality feature fusion. Experimental evaluations on the VQA
2.0 dataset show that the proposed method achieves state-of-the-art VQA
performance. Extensive ablation studies are carried out for a comprehensive
analysis of the proposed method. Comment: CVPR 2019 Oral
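The inter-modality information flow described above can be sketched as a single cross-modal attention step (a minimal illustration under assumed shapes and names, not the paper's full architecture): features of one modality act as queries and aggregate information from the other modality via scaled dot-product attention.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats, key_feats):
    """One inter-modality attention step: each query-modality feature
    (e.g. a word) attends over all key-modality features (e.g. image
    regions) and returns their attention-weighted sum.
    query_feats: (Nq, D), key_feats: (Nk, D) -> (Nq, D)."""
    d = query_feats.shape[-1]
    attn = softmax(query_feats @ key_feats.T / np.sqrt(d), axis=-1)
    return attn @ key_feats

# toy usage: 2 word features attend over 3 visual region features
rng = np.random.default_rng(0)
words = rng.standard_normal((2, 8))
regions = rng.standard_normal((3, 8))
fused = cross_modal_attention(words, regions)
```

Intra-modality attention has the same form with queries and keys drawn from the same modality; the paper's contribution is conditioning that intra-modality step on the other modality and alternating the two flows.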